We present two analyses dedicated to measuring the ratio of branching ratios of the top quark, R = B(t → Wb)/B(t → Wq) (where q = d, s, b), using ttbar events with either one or two prompt isolated leptons (e or μ) in the final state. Furthermore, the framework of the dileptonic analysis was also used for a feasibility study of the measurement of the b-tagging efficiency, assuming R to take its Standard Model value. Data-driven techniques to control the background in the selected events are discussed and the expected results from simulation are presented.



1. Introduction 

Top quarks decay mostly to Wb, while the final states Wd and Ws are suppressed by the square of the CKM matrix elements |V_td| and |V_ts|. Besides single top studies, |V_tb| can also be obtained from top pair production, by measuring R = B(t → Wb)/B(t → Wq), with q = d, s, b, and assuming that exactly three generations of quarks exist, as the Standard Model (SM) predicts; indeed, by imposing the unitarity of the 3x3 CKM matrix, this ratio is R = |V_tb|² / (|V_td|² + |V_ts|² + |V_tb|²) = |V_tb|².
Without any assumption on the number of quark generations, an R measurement is still useful to put constraints on V_tb and, more importantly, it can give a hint of the existence of a fourth generation; indeed, in such a scenario R is appreciably less than the SM value [1]. The most recent R measurement obtained by CDF with ~162 pb^-1 is R > 0.61 at 95% C.L. [2]; D0 measured R simultaneously with the ttbar cross section and obtained the value R = 0.97^{+0.09}_{-0.08} and a limit R > 0.79 at 95% C.L. with ~900 pb^-1 [3]. The direct measurement of the CKM element |V_tb| (predicted by the SM as |V_tb| = 0.999133^{+0.000044}_{-0.000043} [4]) is possible only through the study of single top production, and currently the only available measurements are from the D0 [5] and CDF [6] experiments. In the CMS experiment [7], two feasibility studies of the R measurement have been carried out, one using selected semileptonic ttbar events [8] and described in Sec. 3, the other using selected dileptonic ttbar events [9] and described in Sec. 4.
Both analyses use data-driven methods to estimate the irreducible background contribution and take the number of b-tagged jets as the physical observable; therefore the b-tagging efficiency must be fixed to a value obtained from an independent measurement. Furthermore, the same analysis framework can be used to measure the b-tagging efficiency by assuming R to take its SM value; such a study was performed for the dilepton channel and is described in Sec. 4.3.2.



2. General Method 

The parameter R = B(t → Wb)/B(t → Wq) is measured by counting the number of jets originating from b quarks (b-jets) in ttbar events. The number of b-tagged jets depends, beyond the R value itself, on the b-tagging efficiency (ε_b) and the mis-tagging probability (ε_q). Therefore, the probability of having a given number i of b-tagged jets is a function of R, ε_b and ε_q. It is called ε_i(R; ε_b, ε_q) and can be expressed as follows:

\epsilon_i(R; \epsilon_b, \epsilon_q) = R^2 \, P_i(t\bar{t} \to bWbW)
                                      + 2R(1-R) \, P_i(t\bar{t} \to bWqW)
                                      + (1-R)^2 \, P_i(t\bar{t} \to qWqW)    (1)

where q can represent an s or d quark and each P_i (the probability for a given ttbar decay of having i b-tagged jets in the final state) depends on ε_b and ε_q. This function is used to fit the distribution of the number of b-tagged jets (n_btag) in order to measure the value of the R parameter. Specific algorithms are used to identify the flavor of the jets. For this study, the Track Counting (TC) and Jet Probability (JP) [10] algorithms are used to tag the b-jets. The efficiency of the TC and JP algorithms can be measured in QCD events with reconstructed jets containing muons. The p_T^rel method [11] exploits the distribution of the relative transverse momentum of the muon with respect to the jet to estimate the number of b-jets present in data.
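As an illustration of Eq. 1, the following minimal sketch evaluates ε_i(R; ε_b, ε_q) under a simplifying assumption that is not spelled out in the text: only the two jets from the top decays enter the tag count, and each jet is tagged independently. The function names are hypothetical, and the efficiencies used in the example are the working-point values quoted in Sec. 3.1.

```python
def tag_probs_two_jets(eff1, eff2):
    """Probabilities of 0, 1 and 2 tags for two independently tagged jets."""
    return [
        (1 - eff1) * (1 - eff2),
        eff1 * (1 - eff2) + (1 - eff1) * eff2,
        eff1 * eff2,
    ]


def btag_multiplicity(r, eps_b, eps_q):
    """Eq. 1 for i = 0, 1, 2 (assumption: exactly two jets from top decays)."""
    p_bb = tag_probs_two_jets(eps_b, eps_b)  # ttbar -> bWbW
    p_bq = tag_probs_two_jets(eps_b, eps_q)  # ttbar -> bWqW
    p_qq = tag_probs_two_jets(eps_q, eps_q)  # ttbar -> qWqW
    w_bb, w_bq, w_qq = r ** 2, 2 * r * (1 - r), (1 - r) ** 2
    return [w_bb * p_bb[i] + w_bq * p_bq[i] + w_qq * p_qq[i] for i in range(3)]


if __name__ == "__main__":
    for r in (1.0, 0.9):
        print(r, btag_multiplicity(r, eps_b=0.79, eps_q=0.13))
```

Lowering R moves probability from the two-tag bin to the zero- and one-tag bins, which is what gives the n_btag distribution its sensitivity to R.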



3. Semi-leptonic ttbar analysis

The final state of the semi-leptonic ttbar decay channel (one W → qq' and the other W → lν) is characterized by two quarks coming from the direct decay of the top quarks, two quarks coming from the decay of one W, and a lepton and a neutrino from the other W decay. Therefore the final experimental signature is four or more jets, a single lepton (electron or muon) and missing transverse energy. The generation of the Monte Carlo signal and background samples is described in [8]. The following results refer to an integrated luminosity of 1 fb^-1.

3.1. Selection and Event Reconstruction 

The selection starts with the High Level Trigger (HLT) requirement of a lepton with sufficiently large p_T (p_T > 15 GeV for muons or p_T > 18 GeV for electrons).
The details of the physics object reconstruction are given in [8] and references therein. Offline electron reconstruction and identification use tracker and electromagnetic calorimeter information, while the muon reconstruction uses both tracker and muon chamber information. An isolation variable for the leptons is defined as the ratio between the sum of the p_T of the tracks and the energies of the calorimetric deposits around the candidate, and the p_T of the lepton candidate itself. The lepton candidate must have an isolation variable less than 0.1 and p_T > 30 GeV/c. If more than one lepton is selected, the event is rejected. The jet reconstruction algorithm uses the calorimetric energy deposits with a seed threshold of E = 1 GeV and performs an iterative cone procedure with radius ΔR = sqrt(Δφ² + Δη²) = 0.5. The jet candidates are selected by requiring E_T > 40 GeV and |η| < 2.4; in order to reject fake jets, they are required to have a ratio of electromagnetic energy to total energy less than one and to be far enough from the lepton candidate (lep), by imposing ΔR(jet, lep) > 0.5. The missing transverse energy (E_T^miss) used in this analysis is computed as the vector sum of the energy deposits in the calorimeters. The reconstruction of the neutrino momentum is needed to compute the leptonic top quark mass; the transverse component comes from the E_T^miss value, while the longitudinal component is determined from four-momentum conservation in the W boson decay. A useful kinematic variable to reduce the background contamination is the Centrality. It represents the fraction of the hard scattering going into the transverse plane and is defined as:

\mathrm{Centrality} = \frac{\sum E_T}{\sqrt{(\sum E)^2 - (\sum p_z)^2}}    (2)

It is required to be larger than 0.35. The final step of the event reconstruction is the computation of the invariant masses using the selected reconstructed objects. Among the selected jets, the four with the largest E_T are considered as coming from the decays of the two top quarks and of the hadronic W. While the selected lepton and the missing energy are known to come from a W decay, the assignment of the four chosen jets to the partons has to be determined. In order to choose the right combination, a two-step association is used. Beforehand, the masses and the widths of the hadronic W boson and of the top quarks are obtained from simulation. The distributions of the three invariant masses of the reconstructed objects well matched to the generated particles are used to obtain the parameters m_Whad, m_tHad, m_tLep, σ(m_Whad), σ(m_tHad) and σ(m_tLep). First the hadronic W boson is reconstructed by computing the invariant mass of every pair of jets among the four. The pair with the invariant mass nearest to the W one, namely ij, is chosen. The following cut is applied:

|m_{ij} - m_{Whad}| < \sigma(m_{Whad})    (3)

The second step is the association of the two remaining jets (k and p) to the partons coming from the direct decay of the top quarks. To this end a χ² based on the two top quark masses is defined:

\chi^2 = \left( \frac{m_{ijk} - m_{tHad}}{\sigma(m_{tHad})} \right)^2 + \left( \frac{m_{l\nu p} - m_{tLep}}{\sigma(m_{tLep})} \right)^2    (4)

where i and j are the two jets chosen as coming from the W boson decay. Now the only combinatorial ambiguity lies in the choice of which of the two remaining jets is associated with which of the two top quarks. The association that minimizes the χ² is assumed to be the correct one. Events with a large χ²_min are considered to be wrongly reconstructed, so the cut χ²_min < 4 is applied.
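The two-step association can be sketched as follows. This is only an illustration under stated assumptions: objects are plain (E, px, py, pz) tuples, the function names are hypothetical, and the reference masses and widths in `REF` are placeholder numbers, since the analysis derives them from simulation and does not quote them here.

```python
import math
from itertools import combinations


def inv_mass(*vecs):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    e, px, py, pz = (sum(v[n] for v in vecs) for n in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


# Placeholder reference values in GeV (assumed, not taken from the paper).
REF = {"m_w": 80.4, "s_w": 10.0,
       "m_thad": 175.0, "s_thad": 20.0,
       "m_tlep": 175.0, "s_tlep": 25.0}


def associate(jets, lepton, neutrino, ref=REF, chi2_max=4.0):
    """Return the chosen assignment, or None if the event fails a cut.

    `jets` holds the four highest-E_T selected jets.
    """
    # Step 1: jet pair with invariant mass closest to the hadronic W (Eq. 3).
    i, j = min(combinations(range(4), 2),
               key=lambda pair: abs(inv_mass(jets[pair[0]], jets[pair[1]])
                                    - ref["m_w"]))
    if abs(inv_mass(jets[i], jets[j]) - ref["m_w"]) >= ref["s_w"]:
        return None
    # Step 2: try both assignments of the remaining jets and keep the
    # one with the smaller chi2 (Eq. 4).
    rest = [n for n in range(4) if n not in (i, j)]
    best = None
    for k, p in (rest, rest[::-1]):
        m_had = inv_mass(jets[i], jets[j], jets[k])
        m_lep = inv_mass(lepton, neutrino, jets[p])
        chi2 = (((m_had - ref["m_thad"]) / ref["s_thad"]) ** 2
                + ((m_lep - ref["m_tlep"]) / ref["s_tlep"]) ** 2)
        if best is None or chi2 < best["chi2"]:
            best = {"w_jets": (i, j), "had_b": k, "lep_b": p, "chi2": chi2}
    return best if best["chi2"] < chi2_max else None
```

Because the W pair is fixed in the first step, only two χ² values have to be compared per event.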

After the whole selection, the expected number of events for an integrated luminosity of 1 fb^-1 is 2650 for semileptonic ttbar, while the main background processes contribute 109 events for other ttbar decays, 260 for W + jets, 52 for Z + jets, 52 for tW and 56 for QCD di-jet. The b-tagging algorithm adopted in this analysis is Jet Probability [10], and the chosen working point is such that ε_b = (79 ± 1)% and ε_q = (13 ± 1)%.

3.2. Background Subtraction 

The χ²_min defined in Eq. 4, referred to as χ²_normal in the following, peaks at low values for correctly reconstructed semileptonic ttbar events, called signal in the following. Background and incorrectly reconstructed ttbar events (Background in the following) lead to low values of χ²_normal only through random combinatorics. Therefore, if the direction of one of the selected jets is artificially changed, the mass χ² distribution should remain the same for background events, while the distribution for signal events is expected to change appreciably. We can define a χ²_random just like χ²_normal, but computed by assigning a random direction to one of the two jets considered as coming directly from the top quarks. We decided to change the direction of the one with the highest transverse energy. Uniform distributions for φ and η have been generated, allowing for φ in the range (−π, π) and η in the range (−2.4, 2.4), as for the selected true jets. Then the χ² procedure was repeated, leading to the new combination that gives the minimum χ², called χ²_random. Fig. 1 shows the distribution of the χ²_min variable separately for signal and background events.

Figure 1: Up: χ²_min distribution of the signal sample (as defined in the text). Down: χ²_min distribution of the complete background sample. Both figures show the χ²_normal (solid) and the χ²_random (dashed) distributions.

The n_btag distribution of the events selected after the cut χ²_normal < 4 will be referred to as n_btag^normal, while that of the events selected after the cut χ²_random < 4 will be referred to as n_btag^random. Fig. 2 (upper panel) shows the result of the n_btag^normal − n_btag^random subtraction for the signal sample (solid) and for the background sample (dashed); the latter is compatible with a flat zero distribution. Therefore, if one considers the whole data sample, containing signal and background events, and computes bin by bin the difference of the Normal and Random distributions, the resulting n_btag distribution will be proportional to the distribution of the signal only, as Fig. 2 (lower panel) shows.
Figure 2: Up: n_btag^normal − n_btag^random distribution for signal (solid) and background (dashed) events, normalized to L = 1 fb^-1. Down: n_btag^normal − n_btag^random distribution for the whole data sample (solid) and n_btag^normal distribution of the signal only (dashed), normalized to unity.
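The subtraction described in Sec. 3.2 can be illustrated with a short sketch. Two points are assumptions made here for the sake of the example: the randomized jet keeps its transverse momentum (the text only states that the direction is randomized), and the histogram contents are invented toy numbers, not results of the analysis.

```python
import math
import random


def random_direction(jet_pt, rng, eta_max=2.4):
    """Momentum (px, py, pz) with the same pT but uniform random phi and eta."""
    phi = rng.uniform(-math.pi, math.pi)
    eta = rng.uniform(-eta_max, eta_max)
    return (jet_pt * math.cos(phi), jet_pt * math.sin(phi),
            jet_pt * math.sinh(eta))


def subtract_bin_by_bin(n_normal, n_random):
    """Difference of two n_btag histograms with identical binning."""
    return [a - b for a, b in zip(n_normal, n_random)]


def normalize(hist):
    """Scale a histogram to unit sum."""
    total = sum(hist)
    return [h / total for h in hist]


# Invented toy counts for n_btag = 0, 1, 2.
signal_normal = [100, 900, 1200]
signal_random = [60, 300, 250]
background_normal = [50, 30, 10]
background_random = [50, 30, 10]  # unchanged by the randomization

data_normal = [s + b for s, b in zip(signal_normal, background_normal)]
data_random = [s + b for s, b in zip(signal_random, background_random)]
data_difference = subtract_bin_by_bin(data_normal, data_random)

if __name__ == "__main__":
    print(normalize(data_difference))
```

In this toy the background cancels exactly because its Normal and Random histograms are identical; in the analysis the cancellation holds only within statistical fluctuations.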



3.3. Fit Results 

The distribution resulting from the bin-by-bin subtraction for the whole data sample, after normalization, is fitted with Eq. 1. In order to check the effectiveness of the method, the fit was repeated assuming several R values. Different values of R (R_gen) were generated in the range [0.9, 1] by properly weighting three samples in which the ttbar decay was forced: ttbar → WbWb, ttbar → WbWq and ttbar → WqWq. The statistical uncertainty remains stable over the whole range and is σ_R(stat) = 0.12. The measured values of R agree with R_gen within the statistical uncertainty over the range R_gen = [0.9, 1].
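A minimal sketch of this closure test is given below. It assumes the same simplified two-jet form of Eq. 1 used for illustration purposes (an assumption, not the full analysis model), and it replaces the actual fit with a plain least-squares grid scan over R; all function names are hypothetical.

```python
def btag_probs(r, eps_b, eps_q):
    """Simplified Eq. 1: two independently tagged jets from the top decays."""
    def two_jets(e1, e2):
        return [(1 - e1) * (1 - e2), e1 * (1 - e2) + (1 - e1) * e2, e1 * e2]

    bb = two_jets(eps_b, eps_b)
    bq = two_jets(eps_b, eps_q)
    qq = two_jets(eps_q, eps_q)
    return [r * r * bb[i] + 2 * r * (1 - r) * bq[i] + (1 - r) ** 2 * qq[i]
            for i in range(3)]


def sample_weights(r_gen):
    """Relative weights of the three forced-decay samples for a given R_gen."""
    return {"WbWb": r_gen ** 2,
            "WbWq": 2 * r_gen * (1 - r_gen),
            "WqWq": (1 - r_gen) ** 2}


def fit_r(observed, eps_b, eps_q, steps=1000):
    """Least-squares grid scan of R in [0, 1] against a normalized histogram."""
    best_r, best_cost = 0.0, float("inf")
    for n in range(steps + 1):
        r = n / steps
        predicted = btag_probs(r, eps_b, eps_q)
        cost = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        if cost < best_cost:
            best_r, best_cost = r, cost
    return best_r


if __name__ == "__main__":
    for r_gen in (0.90, 0.95, 1.00):
        observed = btag_probs(r_gen, 0.79, 0.13)
        print(r_gen, sample_weights(r_gen), fit_r(observed, 0.79, 0.13))
```

With noise-free input the scan returns R_gen itself; the statistical uncertainty quoted in the text comes from the finite event count, which this sketch does not model.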

3.4. Systematic Uncertainties 

The various uncertainties were estimated based on the anticipated knowledge of the CMS experiment after 1 fb^-1 of integrated luminosity [8]. All systematic contributions were assumed to be uncorrelated; therefore the total systematic uncertainty has been computed by summing them in quadrature. In order to check the impact of ε_b on the R measurement, its value was varied by 4%. Since the number of b-tagged jets from Eq. 1 does not take into account the presence of b-jets from radiation, while the value of ε_b measured in real data [12] does, a contribution due to this bias is considered. The systematic uncertainty associated with the jet energy scale is estimated by shifting the calibrated transverse energy of each jet used in the analysis by a relative 5%. The effect of the difference in the selection efficiency between ttbar → WbWb (ε_bb) and ttbar → WqWq (ε_qq) was estimated by varying ε_bb by ε_bb − ε_qq (= 0.04%). The χ² cut was varied by ±0.5. The results of the systematics study for each source are summarized in Table I together with the total value.

Proceedings of the DPF-2009 Conference, Detroit, MI, July 27-31, 2009

Table I: Contributions to the systematic uncertainty.

Systematic source              σ_sys
b-tagging efficiency           0.04
b-tagging efficiency bias      0.04
Jet energy scale               0.09
χ² cut                         0.02
Selection efficiency           0.006
Total                          0.11
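The total in Table I can be cross-checked by adding the individual contributions in quadrature, as stated in the text. The short sketch below does only that arithmetic; the dictionary keys are labels chosen here for readability.

```python
import math

# Individual contributions copied from Table I.
contributions = {
    "b-tagging efficiency": 0.04,
    "b-tagging efficiency bias": 0.04,
    "jet energy scale": 0.09,
    "chi2 cut": 0.02,
    "selection efficiency": 0.006,
}

# Uncorrelated sources add in quadrature.
total = math.sqrt(sum(value ** 2 for value in contributions.values()))

if __name__ == "__main__":
    print(f"total systematic uncertainty = {total:.3f}")
```

The quadrature sum is about 0.108, consistent with the rounded total of 0.11, and it is dominated by the jet energy scale term.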



4. Dileptonic ttbar analysis

This study considers ttbar events where both W bosons decay to leptons; the final state with an electron and a muon was chosen, as it is the channel with the largest cross section and the smallest background. The generation of the Monte Carlo signal and background samples is described in [9]. All the results refer to an integrated luminosity of 250 pb^-1.

4.1. Event Selection 

The event selection is tuned to identify leptonic final states with two prompt, isolated leptons with high transverse momenta in the CMS detector. The selection is detailed in [9]. Data samples are triggered by requiring a non-isolated single muon (p_T > 9 GeV/c) or a single electron (E_T > 15 GeV). Lepton candidates are reconstructed with p_T > 20 GeV/c in the fiducial region |η| < 2.4 of the detector and must satisfy identification and isolation requirements. The leptons are required to be separated by ΔR > 0.1. In the case of multiple selected leptons, the ambiguity is resolved by selecting the eμ candidates with opposite electric charge and the highest transverse momenta.



Jets are reconstructed using the seedless infrared-safe cone algorithm and are required to have at least two calorimeter towers with a minimum E_T sum of 2 GeV. Jets are required to have at least one assigned track so that the b-tagging algorithms can be applied. These cuts define the "taggability" requirements. The energy of the jets is corrected for the η-dependence and the absolute E_T using MC-based corrections for generator-level jets. Taggable jets are selected with E_T(corrected) > 30 GeV and |η| < 2.4. Jet candidates are further required to be separated from the selected leptons by ΔR(jet, lepton) > 0.3 and to have an electromagnetic fraction EMF < 0.98. The total missing transverse energy, E_T^miss, is corrected for the energy deposited by muons and is required to be above 30 GeV. With 250 pb^-1 of integrated luminosity, after the described selection, the expected number of events is 787 for dileptonic ttbar, and the main background contributions are due to other ttbar decays (14 events), single top (29 events), di-boson (10.5 events) and W/Z + jets (26 events). Therefore, after the selection a signal-to-background ratio of approximately 10 is expected.
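The quoted signal-to-background ratio follows directly from the expected yields. The sketch below repeats that arithmetic with the numbers given above; the variable names are chosen here for illustration.

```python
# Expected yields for 250 pb^-1, copied from the text.
signal = 787.0
backgrounds = {
    "other ttbar": 14.0,
    "single top": 29.0,
    "di-boson": 10.5,
    "W/Z + jets": 26.0,
}

total_background = sum(backgrounds.values())
s_over_b = signal / total_background
background_fraction = total_background / (signal + total_background)

if __name__ == "__main__":
    print(f"B = {total_background}, S/B = {s_over_b:.1f}, "
          f"background fraction = {background_fraction:.1%}")
```

The total background is 79.5 events, giving S/B of about 9.9 and a background fraction of roughly 9% of the selected sample.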

4.2. The jet misassignment estimate 

Despite the small contributions from other background processes, there is a non-negligible probability that at least one jet from a ttbar decay is missed, either because it was not reconstructed or because it did not pass the jet selection criteria, and another jet is chosen instead (for example, a jet from ISR/FSR). This will be referred to as "jet misassignment", and an estimate of the jet misassignment level has to be made from data. The estimate is expressed in terms of probability weights α_i, where i = 0, 1, 2 is the number of jets from top decays correctly reconstructed and selected. The selected events are a combination of three different categories:

• events with no jet selected from the top decays, weighted by α_0 (background-dominated);

• events with only one jet correctly assigned to the top decay, weighted by α_1 (combination of signal and background);

• events with two jets correctly assigned to the top decays, weighted by α_2 (signal-dominated).

To a first approximation the weights α_i can be parameterized in terms of a binomial combination of α, the probability of correctly assigning an individual jet. The value of α can be estimated directly from data using the kinematic properties of the events. A correlation can be sought in the lepton-jet pairs originating from the same top quark decay [13], and it is possible to show that no pair with M_lb > M_lb^max = sqrt(m_t² − m_W²) ≈ 156 GeV/c² should be observed (spectrum endpoint). Two methods are proposed to emulate the invariant mass distribution of the misassigned jets: "swapping" the jet in the assigned lepton-jet pair with a jet from a different event, or "randomly rotating" the momentum vector of the selected leptons. As the "random rotation" and "swap" methods yield similar results, their average is used to model the invariant mass distribution of the background jets. The distribution of the "swapped" and "randomly rotated" pairs, normalized to fit the high-end part of the distribution, is superimposed on the observed lepton-jet mass spectrum. The two background models provide a good estimate of the fraction of misassigned pairs with M_lepton-jet > 190 GeV/c² (Fig. 3). The normalization factor applied to the distribution of the swapped (randomly rotated) pairs is related to the misassignment fraction, 1 − α [9].
Figure 3: Invariant mass of the lepton-jet pairs for the whole data sample.
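Two relations from Sec. 4.2 lend themselves to a quick numerical check: the kinematic endpoint of the lepton-jet invariant mass and the binomial parameterization of the weights α_i. In the sketch below the top quark and W boson masses are assumed input values (the text does not quote the ones used), the b quark is treated as massless, and the function names are hypothetical.

```python
import math


def mlb_endpoint(m_top, m_w):
    """Endpoint of the lepton-b invariant mass, neglecting the b quark mass."""
    return math.sqrt(m_top ** 2 - m_w ** 2)


def binomial_weights(alpha):
    """(alpha_0, alpha_1, alpha_2) for a per-jet correct-assignment probability."""
    return ((1 - alpha) ** 2, 2 * alpha * (1 - alpha), alpha ** 2)


if __name__ == "__main__":
    # Assumed masses in GeV/c^2.
    print(f"endpoint = {mlb_endpoint(175.0, 80.4):.1f} GeV/c^2")
    # Example: an alpha such that alpha_2 = alpha^2 = 0.67.
    print(binomial_weights(math.sqrt(0.67)))
```

With these assumed masses the endpoint comes out near 155 GeV/c², close to the value of about 156 GeV/c² quoted in the text; the three binomial weights always sum to one.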



4.3. Measurements by fitting the n_btag distribution

The following subsections describe the measurement of R and of the b-tagging efficiency, respectively; for both measurements a fit of the n_btag distribution is performed with the function in Eq. 1, as in the semi-leptonic analysis, but here it also depends on α (besides R, ε_b and ε_q). In both studies the parameter α_2 = α² is fixed to the value obtained from data as explained above, α_0 is a free parameter and α_1 is obtained from the normalization (α_1 = 1 − α_0 − α_2). The value obtained for α_2 is α_2 = 0.67 ± 0.07 (stat) ± 0.03 (syst), while the one obtained by using the MC truth is 0.63 ± 0.02. ε_q is fixed to the value obtained by other data-driven methods [14], and the other parameters are fixed or free depending on the study.

4.3.1. Measurement of R

In order to measure R, ε_b was fixed; the results for R_gen = 1, for the two b-tagging algorithms (JP and TC), each for three working points, are shown in Tab. II. Figure 4 shows the results obtained by fitting R and α_0 using jets tagged with the JP loose working point.

Table II: R fit results for an integrated luminosity of L = 250 pb^-1. Statistical uncertainties from the fit and from MC truth are included.

b-tagging algorithm    Working point
                       loose          medium         tight
Jet Probability        1.01 ± 0.02    1.00 ± 0.02    0.97 ± 0.03
Track Counting         1.00 ± 0.02    0.99 ± 0.02    1.04 ± 0.03

Figure 4: Fit to R and α_0; only statistical uncertainties are shown.

Different subsamples with forced decays are weighted similarly to the semi-leptonic study (Sec. 3.3) in order to give different R_gen values. All backgrounds are included. The b-tag multiplicity distributions obtained this way are sampled according to the expected number of events and fitted to determine R. The statistical uncertainty of each fit result is then determined from the width of the distribution of R_generated − R_measured. The systematic uncertainties are dominated by the uncertainty on the b-tagging efficiency. The total uncertainty is σ_R(stat + sys) = 0.09 with 250 pb^-1.
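The pseudo-experiment procedure can be sketched as follows. This is a deliberately simplified toy, not the analysis itself: it uses the two-jet form of Eq. 1 without the α weights or backgrounds, it takes ε_q = 0.13 as an assumed value, and it replaces the fit with a closed-form estimate of R from the mean number of tags per jet. The function names are hypothetical.

```python
import random
import statistics


def btag_probs(r, eps_b, eps_q):
    """Simplified Eq. 1: each of two jets is a b with probability r."""
    p = r * eps_b + (1 - r) * eps_q  # per-jet tagging probability
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def estimate_r(tags, eps_b, eps_q):
    """Closed-form stand-in for the fit: invert the mean tag rate per jet."""
    p_hat = sum(tags) / (2 * len(tags))
    return (p_hat - eps_q) / (eps_b - eps_q)


def run_toys(r_gen, eps_b, eps_q, n_events, n_toys, seed=1):
    """Mean and width of R_generated - R_measured over pseudo-experiments."""
    rng = random.Random(seed)
    probs = btag_probs(r_gen, eps_b, eps_q)
    residuals = []
    for _ in range(n_toys):
        tags = rng.choices((0, 1, 2), weights=probs, k=n_events)
        residuals.append(r_gen - estimate_r(tags, eps_b, eps_q))
    return statistics.mean(residuals), statistics.stdev(residuals)


if __name__ == "__main__":
    bias, width = run_toys(r_gen=1.0, eps_b=0.82, eps_q=0.13,
                           n_events=787, n_toys=500)
    print(f"bias = {bias:+.4f}, width = {width:.4f}")
```

The width of the residual distribution plays the role of the statistical uncertainty on R, and the mean checks for a bias of the estimator.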
4.3.2. Measurement of b-tagging efficiency

Here R = 1 is fixed and the b-tagging efficiency, ε_b, is measured. The constraint 0 < ε_b < 1 is used in the fit. The results are shown in Tab. III. A simultaneous fit to the b-tagging efficiency and α_0 yields the 2-dimensional distribution shown in Figure 5. The total systematic uncertainty is 4% and is due to the uncertainty on α. The uncertainty is estimated by repeating the fit procedure after displacing each parameter by positive/negative values drawn from a Gaussian distribution centered at zero with a width given by the corresponding uncertainty of the parameter. The sensitivity of the ε_b measurement is about ±0.02 when R is varied by 5%. The fitting model is derived for ttbar events and the bias is estimated to be small, given that the background events are only 10% of the total sample. The good agreement (within uncertainties) of the fit results with the MC truth values justifies this assumption. The uncertainty due to different ISR/FSR content in the final sample is expected to be small (< 1%).

Table III: Fit to b-tagging. R = 1 is fixed and α is fixed to the value estimated with the swap method. Statistical uncertainties from the fit and from MC truth are included.

Algorithm          Working point    ε_b (MC truth)    ε_b
Jet Probability    loose            0.82 ± 0.01       0.81 ± 0.02
Jet Probability    medium           0.63 ± 0.01       0.63 ± 0.02
Jet Probability    tight            0.41 ± 0.01       0.41 ± 0.02
Track Counting     loose            0.80 ± 0.01       0.82 ± 0.02
Track Counting     medium           0.65 ± 0.01       0.65 ± 0.02
Track Counting     tight            0.40 ± 0.01       0.41 ± 0.02

Figure 5: Contour plot of the fit to b-tagging efficiency and α_0. R = 1 is fixed and α is fixed to the value estimated with the swap method.


5. Conclusions 

Two feasibility studies of the R measurement were presented, one using selected semi-leptonic ttbar events and the other using selected di-leptonic ttbar events in the eμ channel. The expected uncertainties for the semi-leptonic channel with L = 1 fb^-1 are σ_R(stat) = 0.12 and σ_R(sys) = 0.11. For the di-leptonic channel, with L = 250 pb^-1, the expected uncertainty is σ_R(stat + sys) = 0.09. Furthermore, in the dileptonic channel a study of the ε_b measurement, fixing R to the SM value, has been performed. The expected uncertainties are σ_εb(stat) = 0.02 and σ_εb(sys) ≈ 0.04. Both studies use data-driven methods to subtract the background contribution.